Abstract: We consider the problem of information reconciliation
in the context of secret key agreement between two legitimate
parties, Alice and Bob. Starting from the secret key agreement
model introduced by Ahlswede and Csiszar, the channel-type model
with wiretapper, we study a protocol based on error correcting
codes. The protocol can be adapted to changes in the communication
channel by extending the original source. The efficiency of the
reconciliation is limited only by the quality of the code and,
although the protocol transmits more information than is needed to
reconcile Alice's and Bob's sequences, it does not reveal any more
information on the original source than an ad-hoc code would have
revealed.



I. Introduction

Let us start by considering the channel-type model with wiretapper
(CW) for secret key agreement introduced by Ahlswede and Csiszar
[1], as shown in Fig. 1. In this model a legitimate party, Bob, and
an eavesdropper, Eve, are both connected to another legitimate
party, Alice, through a discrete memoryless channel (DMC). Alice
generates a discrete sequence of $n$ values, $X^n$, while Bob and
Eve observe the correlated outputs, $Y^n$ and $Z^n$ respectively,
obtained after the transmission of $X^n$ over the DMC. Both outputs
are characterised by the transition probability $P_{YZ|X}$, with
each component of the sequences being the outcome of an independent
use of the channel. Alice and Bob also have access to a public but
authenticated channel, which they use to distill a shared secret
key from their correlated sequences. Public and authenticated means
in this context that Eve has noiseless access to the information
exchanged through the channel, but she is not able to tamper with
the channel without being noticed. Therefore the integrity of the
messages on the public channel is guaranteed.




Fig. 1. Ahlswede and Csiszar's model CW. 



Protocols that distill a secret key usually divide the distillation
process into two phases. In the first one, known as information
reconciliation or simply reconciliation, Alice and Bob exchange
redundant information over the public channel in order to eliminate
any discrepancy in their correlated sequences, $X^n$ and $Y^n$
respectively. At the end of the reconciliation phase both parties
have agreed on a shared secret string $\chi$, though in many cases
$\chi = X^n$. In the second phase, known as privacy amplification,
Alice and Bob shrink their strings in order to wipe out any
information on the previously shared string $\chi$ that the
eavesdropper could have gathered through $Z^n$ or through any
communication over the public channel carrying information about
the strings. This construction splits the secret key distillation
process into two easier problems. The division is not necessarily
suboptimal and, as shown in Section IV, under certain conditions
Alice and Bob can achieve the maximal secret key rate.

The paper is organised as follows: Section II reviews the
information reconciliation problem and links it with secret key
agreement. Section III describes an information reconciliation
protocol over an extended string; this protocol uses Wyner's coset
scheme with Low-Density Parity-Check (LDPC) codes [2] and can
achieve an efficiency as close to its optimum value as allowed by
the quality of the code. In Section IV it is proved that the
proposed protocol does not reveal any more information on $X$ than
a solution adapted to the string $X$ would reveal. Finally,
Section V analyses the performance of this protocol in a practical
scenario.

II. Problem Statement 

The secret key distillation process is usually divided into privacy
amplification and information reconciliation. This section defines
the meaning of secret key in the context of this paper. Then
privacy amplification and information reconciliation are introduced
and linked. The objective is to highlight the influence of
efficient reconciliation on the achievable secret key rate.

A. Secret Key Agreement 

Alice, Bob and Eve hold $n$-length sequences, $X^n$, $Y^n$ and
$Z^n$ respectively, with each component of the sequences being
characterised by $P_{YZ|X}$.

Let $\phi_i$ denote the message that Alice sends over the public
channel in its $i$-th use, and $\psi_i$ denote the message that Bob
sends in his $i$-th use of the channel. Both sets of messages or
communications, $\phi$ and $\psi$, are respectively known as
forward and backward transmissions. Depending on the protocol, any
individual message, or even the whole of $\phi$ or $\psi$, might be
null. The case in which Bob sends nothing, $\psi = \emptyset$, is
known as direct reconciliation, while the case in which Alice sends
nothing, $\phi = \emptyset$, is known as reverse reconciliation.
After $k$ uses of the public channel, i.e. after the exchange of
Alice's first $k$ messages, $\phi^k$, and Bob's first $k$ messages,
$\psi^k$, Alice and Bob estimate their shared keys to be $K$ and
$L$ respectively by using an agreed protocol.

Definition 1: A strong secret key rate $S$ is achievable if, for
large enough $n$ and for every $\epsilon > 0$, there exist
$(\phi^k, \psi^k)$ that simultaneously meet the following
restrictions [5]:

$$\Pr[K \neq L] \leq \epsilon \tag{1}$$

$$I(\phi^k, \psi^k, Z^n; K) \leq \epsilon \tag{2}$$

$$H(K) \geq n \cdot S - \epsilon \tag{3}$$

$$\log |K| \leq H(K) + \epsilon \tag{4}$$



where $H(\cdot)$ stands for Shannon's entropy, while
$I(\cdot;\cdot)$ stands for Shannon's mutual information. This
definition of secret key rate is strong compared to previous
definitions, in which the convergence of the conditions was
asymptotic and not absolute. In [4] it is shown that both sets of
conditions share the same bounds for secret key generation.

Henceforth the superindex indicating length is dropped to lighten
the notation; the length of the variable or string should be clear
from the context, and whenever in doubt we clarify the value that
the superindex takes.

The largest achievable secret rate $S$ is upper bounded by the
secret key capacity, $C_S$, which if only forward communications
are allowed is defined by [1]:

$$C_{S_f} = \max \left[ I(U;Y) - I(U;Z) \right] \tag{5}$$

where $U$ is an auxiliary random variable that forms the Markov
chain $U \to X \to YZ$.

It should be noticed that $C_{S_f}$ is a lower bound of $C_S$ if
two-way communications are allowed [5]. A case of special interest
arises when $U$ cannot be maximised or $X$ cannot be manipulated by
Alice; an example of this situation is a Quantum Key Distribution
(QKD) protocol fixing $X$ [6]. In this case, taking into account
the restrictions, the previous result allows Alice and Bob to
achieve at least a secret rate of

$$I(X;Y) - I(X;Z) = H(X|Z) - H(X|Y) \tag{6}$$
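As a hedged numerical illustration of Eq. (6), the sketch below
assumes a setting that is not taken from the text: $X$ is a uniform
binary source, and Bob and Eve each observe it through a binary
symmetric channel with hypothetical crossover probabilities `p_bob`
and `p_eve`. Under that assumption $H(X|Y) = h(p_{bob})$ and
$H(X|Z) = h(p_{eve})$, where $h$ is the binary entropy function, so
the rate of Eq. (6) reduces to a difference of two binary entropies.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def secret_rate_lower_bound(p_bob: float, p_eve: float) -> float:
    """Eq. (6) for a uniform binary X seen through two BSCs (assumption):
    H(X|Z) - H(X|Y) = h(p_eve) - h(p_bob)."""
    return binary_entropy(p_eve) - binary_entropy(p_bob)


if __name__ == "__main__":
    # Hypothetical crossover probabilities, for illustration only.
    rate = secret_rate_lower_bound(p_bob=0.05, p_eve=0.20)
    print(f"secret bits per channel use: {rate:.4f}")
```

For crossover probabilities below 1/2 the bound is positive only
when Eve's channel is noisier than Bob's, and it vanishes when both
channels are equally noisy.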
Fig. 2. Source coding with side information.



B. Privacy Amplification and Information Reconciliation 

The problem of privacy amplification, i.e. how to reduce $I(X;Z)$,
the knowledge that Eve might have gathered during the process, has
been widely studied. Some of the results on privacy amplification
are based on the use of universal families of hash functions [7];
in this work, however, we use extractors [8], proposed by Maurer
and Wolf for privacy amplification [4], as they allow the strong
secret key rate bounds to be proved. An extractor is a function
that, with a small amount of random bits acting as a catalyst,
obtains a number of almost uniformly distributed random bits from a
source. The main result, which we develop in Section IV, states
that given an upper bound on the information the eavesdropper has,
Alice and Bob can extract a smaller and highly secret key. The
length of the new key is a function of a security parameter and of
the upper bound on Eve's information, which in turn depends on the
information that Eve gathers on the private channel and the
information that Eve gathers in the information reconciliation
phase. This directly links privacy amplification with information
reconciliation.

Information reconciliation in the context of secret key agreement
is also a well known problem. Once it has been separated from
privacy amplification, the problem is reduced to one of
Slepian-Wolf coding [9] (see Fig. 2). Given a source $X$, a rate
$R \geq H(X)$ is sufficient to losslessly encode $X$, and given two
sources $X$ and $Y$ available to a single encoding terminal, a rate
$R \geq H(X,Y)$ is sufficient. The surprising result by Slepian and
Wolf states that even for separate encoding $R \geq H(X,Y)$ is
enough [9] and, of particular interest in information
reconciliation, that it is also enough for Alice to encode her
source $X$ with $R \geq H(X|Y)$ in order to allow Bob to infer $X$.

Wyner's coset scheme is a good solution for the compression of
binary sources with side information [10], [11]. The fundamental
idea is to assign each source vector to a bin from a set of
$2^{H(X|Y)+\epsilon}$ known bins. The encoder, Alice, transmits the
bin number to the decoder, thus encoding $X$ with rate
$R = H(X|Y) + \epsilon$. The decoder looks for the source vector
inside the described bin with the help of the side information $Y$.

The efficiency of an information reconciliation protocol that sends
a sequence $C$ through the public channel to help Bob recover $X$
using side information $Y$ can be measured using a quality
parameter $f$. If we allow $|\cdot|$ to stand for the length of a
variable, $f$ is defined by:

$$f = \frac{|C|}{H(X|Y)} \geq 1 \tag{7}$$

where $H(X|Y)$ and, in Eq. (6), $H(X|Z)$ denote Shannon conditional
entropies.

According to this definition, the efficiency takes its lowest
value, $f = 1$, in the optimal case, i.e. when the information
published for reconciliation is the minimum possible.
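For instance, when the discrepancies between the strings are
modelled by $n$ uses of a binary symmetric channel with crossover
probability $p_{err}$ (the setting used later in Example 1),
$H(X|Y) = n\,h(p_{err})$ and Eq. (7) can be evaluated directly. The
following sketch is only illustrative; the function names and the
numbers in the usage example are assumptions, not taken from the
paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def reconciliation_efficiency(disclosed_bits: float, n: int, p_err: float) -> float:
    """Eq. (7): f = |C| / H(X|Y), with H(X|Y) = n * h(p_err) on a BSC."""
    return disclosed_bits / (n * binary_entropy(p_err))


if __name__ == "__main__":
    # Hypothetical setting: the syndrome of a rate-0.6 code of length
    # 2 * 10^5 discloses n - k = 80,000 bits, and the strings differ
    # with crossover probability 0.07.
    f = reconciliation_efficiency(disclosed_bits=80_000, n=200_000, p_err=0.07)
    print(f"f = {f:.3f}")
```

Because the number of disclosed bits is fixed by the code, $f$ grows
as $p_{err}$ decreases below the value the code was designed for.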

C. Previous Work 

Several protocols have been studied for information reconciliation.
Many of them have been discussed in the context of Quantum Key
Distribution (QKD), as it is one of the main scenarios of real
secret key distillation.

Brassard and Salvail proposed the Cascade protocol in [12] for
binary variable reconciliation. Cascade, despite being highly
interactive, remains the most widely used protocol. Its advantages
are a simple description and a relatively low efficiency value.
Other protocols include a protocol by Liu et al. [13] that combines
advantage distillation and information reconciliation, and Winnow
[14], a protocol in which Alice and Bob exchange the syndrome of a
Hamming code for each block.

LDPC codes have been proposed for coding correlated sources in
[15], though no explicit codes were given. A rate adaptive
construction with non-binary LDPC codes was proposed in [16]. In
[17] LDPC codes were optimised for the binary symmetric channel
(BSC) and used to reconcile binary variables. The efficiency of the
codes was close to 1 for crossover probabilities near the codes'
thresholds; however, as only a discrete number of codes was
available, the efficiency exhibited a sawtooth behaviour (see
Fig. 5). A rate adaptive protocol was proposed in [18]; however,
the security of the protocol was not addressed and the impact of
the excess of information on the public channel was not discussed.

III. Rate Adaptive Information Reconciliation 
A. Formalism 

In this section we describe a protocol for the information 
reconciliation problem based on Wyner's coset scheme, briefly 
sketched above. Before describing the protocol we review 
some basic formalism. 

Let $\mathcal{C}(n,k)$ be a binary linear code of length $n$ with
$k$ information symbols, and $R_0 = k/n$ its information rate. This
code can be specified by a parity-check matrix $H$. Let $x$ be an
$n$-length vector, such that $m(x) = Hx^T$ stands for the syndrome
of $x$. The code $\mathcal{C}(n,k)$ contains every $n$-length
vector $v$ such that $m(v) = 0$. The best way to choose the bins
for Wyner's scheme is to choose bins with a structure that allows
differentiating between them. One natural way is to assign a bin to
each coset of a linear code [11]. Each bin can be seen as an affine
code, characterised by a syndrome $m_b$, that contains every
$n$-length vector $v$ such that $m(v) = m_b$. There are $2^{n-k}$
different syndromes, thus allowing Alice to encode $x$ with rate
$(n-k)/n$.
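The coset binning described above can be sketched with a toy code.
The example below uses the (7,4) Hamming code as a stand-in for the
long LDPC codes considered in the paper, and a brute-force search
over the bin instead of message passing; both choices are
assumptions made only to keep the example small.

```python
from itertools import product

# Parity-check matrix of the (7,4) Hamming code (illustrative choice):
# column j holds the binary representation of j, for j = 1..7.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(x):
    """m(x) = H x^T over GF(2), returned as a tuple of n - k bits."""
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in H)


def decode_with_side_info(m_b, y):
    """Return the vector of the bin with syndrome m_b that is closest to y."""
    n = len(y)
    coset = [v for v in product((0, 1), repeat=n) if syndrome(v) == m_b]
    return min(coset, key=lambda v: sum(a != b for a, b in zip(v, y)))


if __name__ == "__main__":
    x = (1, 0, 1, 1, 0, 0, 1)   # Alice's string
    y = (1, 0, 1, 0, 0, 0, 1)   # Bob's string, one bit flipped
    m_b = syndrome(x)           # the n - k = 3 bits sent publicly
    print(decode_with_side_info(m_b, y) == x)
```

Alice discloses only the 3-bit syndrome, i.e. rate $(n-k)/n = 3/7$,
and Bob recovers $x$ because the vectors inside a coset of this code
differ in at least three positions.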

It was first shown in [19] and generalised in [15] that LDPC codes
can be successfully used to address the problem of coding
correlated sources with side information at the decoder. The
message passing decoder must be modified to take into account the
different syndromes, and channel coding techniques that lead to
capacity approaching codes also lead to codes approaching the
Slepian-Wolf limit [17]. However, a linear code reveals a fixed
amount of information independently of the channel characteristics,
which might not be appropriate in many situations. A scenario with
changing statistics can arise in real settings due, for example, to
the sensitivity of physical devices or to the presence of an active
eavesdropper. To address the problem of secret key agreement when
the statistics of the channel can vary from execution to execution,
a suitable solution is provided by puncturing and shortening
strategies (see Fig. 5).

A punctured code modifies an existing $\mathcal{C}(n,k)$ code by
removal of a set of $p$ from the total $n$ symbols, thus becoming a
code of length $n-p$ and dimension $k$, $\mathcal{C}'(n-p,k)$. In
the same fashion, a shortened code is a modified code in which $s$
symbols from the code are known or fixed. A shortened code becomes
a code of length $n-s$ and dimension $k-s$,
$\mathcal{C}'(n-s,k-s)$. A code $\mathcal{C}(n,k)$ in which $p$
symbols are punctured and $s$ symbols are shortened becomes a code
with rate:

$$R = \frac{k-s}{n-p-s} \tag{8}$$

This expression can also be written as a function of $R_0$,
$\sigma = s/n$ and $\pi = p/n$ (the original coding rate, the
fraction of shortened symbols and the fraction of punctured
symbols, respectively) as $R = (R_0 - \sigma)/(1 - \pi - \sigma)$.
Puncturing and shortening provide the means to adapt the rate of an
existing code; however, once $p$ and $s$ have been chosen the new
rate is fixed. It should also be noted that there is a certain
amount of efficiency loss as the percentage of punctured and
shortened bits increases, and even a limiting threshold of
puncturing that depends on the code [20].
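A minimal sketch of Eq. (8) follows. The function names and the
numerical values in the usage example are illustrative assumptions;
the code parameters are those of a rate-0.6 code of length
$2 \times 10^5$, like the one labelled with $R = 0.6$ in Fig. 5.

```python
def adapted_rate(n: int, k: int, p: int, s: int) -> float:
    """Eq. (8): rate of C(n, k) after puncturing p and shortening s symbols."""
    if not (0 <= s <= k and 0 <= p and p + s < n):
        raise ValueError("invalid puncturing/shortening parameters")
    return (k - s) / (n - p - s)


def adapted_rate_from_fractions(r0: float, sigma: float, pi_: float) -> float:
    """Same rate as a function of R_0, sigma = s/n and pi = p/n."""
    return (r0 - sigma) / (1.0 - pi_ - sigma)


if __name__ == "__main__":
    n, k = 200_000, 120_000   # hypothetical code with R_0 = 0.6
    d = 10_000                # total number of punctured or shortened symbols
    print(adapted_rate(n, k, p=d, s=0))   # puncturing only: highest rate
    print(adapted_rate(n, k, p=0, s=d))   # shortening only: lowest rate
```

Puncturing raises the rate and shortening lowers it, so for a fixed
value of $p + s$ the reachable rates lie between the two extremes
printed above.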

B. Rate Adaptive Protocol 

The following definition delineates a generic protocol able to
adapt the information rate to varying channel parameters through
puncturing and shortening strategies. To this end, $s + p$ random
bits are added to the original strings. The protocol transmits
$s + n - k$ bits through the public channel, which can stand for
the code syndrome and $s$ shortened bits.

Definition 2: Let $\mathcal{C}(n,k)$ be a linear code and
$s, p \in \mathbb{N}$ two parameters such that $0 \leq s \leq k$,
$0 \leq p \leq n$, $s + p \leq n$. An sp-protocol allows two
parties holding $x$ and $y$, two $(n-p-s)$-length binary sequences,
to reconcile their strings. This protocol transmits $s + n - k$
bits through a public channel and extends both sequences $x$ and
$y$ with $s + p$ random bits into $\hat{x}$ and $\hat{y}$, two
$n$-length sequences.

We now describe a practical sp-protocol, which is a formal and
simplified version of a protocol described in [18], adapted for
easier analysis. Let $R_0$ be the rate of $\mathcal{C}(n,k)$. In
order to reconcile their strings, the two parties Alice and Bob
perform the following steps:

Step 0: Alice and Bob fix a parameter $\delta = \sigma + \pi$
standing for the fraction of symbols to either puncture or shorten;
this allows them to reconcile the same amount of information on
each protocol execution. They also characterise $f(p_{err})$, the
efficiency function describing the behaviour of the code under
shortening and puncturing, where $p_{err}$ stands for the error
probability.

Fig. 3. Example of Tanner graph of an LDPC code with puncturing and
shortening strategies applied on only one symbol.

Fig. 4. Extended string construction. It is shown how the extended
string $\hat{x}$ is constructed from a random permutation of two
strings: the original string to be reconciled, $x$, and a string
consisting of punctured and shortened symbols.

Prior to the execution of the protocol Alice and Bob might have an
estimate of the discrepancies between their strings, or they might
have published and subsequently discarded a subset of their
original strings in order to infer the error probability $p_{err}$.
Once $p_{err}$ has been estimated and the quality of $\mathcal{C}$
under puncturing and shortening has been measured, Alice and Bob
choose $s$ and $p$ such that $R$, the rate of the equivalent code,
allows both strings to be reconciled with high probability while
minimising $s$.
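The choice of $s$ and $p$ described above can be sketched as
follows, under two assumptions that are not spelled out at this
point of the text: the strings are related by a binary symmetric
channel, so that $H(X|Y) = h(p_{err})$ per symbol, and the target
rate is $R = 1 - f(p_{err})\,h(p_{err})$, as used in Example 1
below. The function name `choose_sp` is hypothetical. Solving
Eq. (8) for $s$ with $p + s = \delta n$ gives
$s = k - R\,(n - \delta n)$, rounded up so that the adapted rate
does not exceed the target.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def choose_sp(n: int, k: int, delta: float, p_err: float, f: float):
    """Split delta * n symbols into p punctured and s shortened symbols.

    Returns (p, s) with the smallest s whose adapted rate
    (k - s) / (n - p - s) does not exceed 1 - f * h(p_err).
    """
    d = round(delta * n)                       # p + s
    target_rate = 1.0 - f * binary_entropy(p_err)
    s = math.ceil(k - target_rate * (n - d))
    if not 0 <= s <= d:
        raise ValueError("target rate is outside the range the code can reach")
    return d - s, s


if __name__ == "__main__":
    # Figures of Example 1, assuming a mother code of rate R_0 = 0.6.
    print(choose_sp(n=200_000, k=120_000, delta=0.05, p_err=0.068, f=1.09))
```

With $k = 1.2 \times 10^5$, the rate-0.6 code of Fig. 5, this
returns $p = 5{,}772$ and $s = 4{,}228$, the values quoted in
Example 1.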

Step 1: Alice creates an extended string $\hat{x}$ (see Fig. 4):

$$\hat{x} = g(x \| r_A(p) \| r_A(s)) \tag{9}$$

where $g$ is a permutation of $x \| r_A(p) \| r_A(s)$, $r_A(p)$ is
a random string of length $p$, and $r_A(s)$ is a random string of
length $s$.

Alice transmits to Bob $m(\hat{x})$ and $r_A(s)$.

Step 2: Bob receives Alice's message and constructs an extended
string $\hat{y}$:

$$\hat{y} = g(y \| r_B(p) \| r_A(s)) \tag{10}$$

where $r_B(p)$ is a random string of length $p$ generated by Bob.

Bob recovers $\hat{x}$ with high probability using the modified
belief propagation decoder described in [19].

Example 1: Alice and Bob have a code
$\mathcal{C}(2 \times 10^5, 1.2 \times 10^5)$ with an empirical
efficiency $f(p_{err}) < 1.09$ in the range
$p_{err} \in [0.065, 0.075]$ (see Fig. 5) for $\delta = 0.05$.
Alice transmits to Bob a string of length $1.9 \times 10^5$ over a
BSC with known crossover probability $p_{err} = 0.068$. In a BSC
the conditional entropy can be expressed as
$H(X|Y) = h(p_{err})$, and thus the maximum coding rate is
$R = 1 - f(p_{err})\,h(p_{err})$. Then, from Eq. (8), they should
puncture $p = 5{,}772$ bits and shorten $s = 4{,}228$ bits to
reconcile their extended strings with high probability and
$f < 1.09$.

An important remark here is that Alice and Bob reconcile their
extended strings with an efficiency close to 1, while $f$, as
defined in Eq. (7) for reconciling the original strings, is higher.
In the next section we show that the amount of distillable secret
bits is not diminished by the higher $f$ value and, indeed, the
relevant figure is the reconciliation efficiency of the extended
strings.

IV. Security Analysis

The security of sp-protocols is addressed in this section. As a
first step we review the privacy amplification results that allow
the impact of reconciliation on the final key to be taken into
account.

We introduce another entropy measure, the min-entropy, as it is
used in the following discussion. It is defined as:

$$H_\infty(X) = -\log \max_x P_X(x) \tag{11}$$

Generally $H_\infty(X) \leq H(X)$, with equality only if the
outcomes of $X$ follow a uniform distribution. We further define
the conditional min-entropy as:

$$H_\infty(X|Y) = \min_y H_\infty(X|Y=y) \tag{12}$$
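Eqs. (11) and (12) can be illustrated on small distributions. The
sketch below is not part of the original analysis; the helper names
and the toy probabilities are assumptions chosen so that the gap
between min-entropy and Shannon entropy is easy to see.

```python
import math


def shannon_entropy(dist):
    """H(X) in bits for a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def min_entropy(dist):
    """Eq. (11): H_inf(X) = -log2 max_x P_X(x)."""
    return -math.log2(max(dist.values()))


def conditional_min_entropy(cond):
    """Eq. (12): min over y of H_inf(X | Y = y).

    `cond` maps each y to the distribution of X given Y = y.
    """
    return min(min_entropy(dist) for dist in cond.values())


if __name__ == "__main__":
    uniform = {x: 0.25 for x in range(4)}
    biased = {0: 0.5, 1: 0.25, 2: 0.25}
    print(min_entropy(uniform), shannon_entropy(uniform))   # equal for uniform X
    print(min_entropy(biased), shannon_entropy(biased))     # H_inf < H otherwise
    print(conditional_min_entropy({"y0": uniform, "y1": biased}))
```

The conditional min-entropy is governed by the worst-case value of
the side information, here the biased conditional distribution.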



Theorem 1: Given three constants $\delta, \Delta_1, \Delta_2 > 0$,
after $n$ uses of a binary symmetric channel ruled by
$P_{Z^n|X^n}$, if Eve's min-entropy on $X$ is known to be bounded
as $H_\infty(X|Z'=z') \geq \delta n$, there exists [4] an extractor
function
$E: \mathbb{F}_2^n \times \mathbb{F}_2^u \to \mathbb{F}_2^k$, with
$u \leq \Delta_1 n$ and $k \geq (\delta - \Delta_2) n$, such that
if Alice and Bob agree on the secret key $K = E(X,U)$, where $U$ is
a sequence of $u$ random uniform bits, the entropy of $K$ is given
by:

$$H(K|U, Z'=z') \geq k - 2^{-n^{1/2-o(1)}} \tag{13}$$

which wipes all the information from the eavesdropper provided that
Alice and Bob can estimate $H_\infty(K|Z')$.

The effect of the $|C|$ redundancy bits shared on the conditional
min-entropy can also be bounded using a security parameter $t$;
with probability $1 - 2^{-t}$ [10]:

$$H_\infty(X|Z'=zc) \geq H_\infty(X|Z=z) - |C| - t \tag{14}$$

This bound quantifies the value of efficient information
reconciliation: every redundancy bit used in this phase reduces the
final secret key.

We proceed to demonstrate that the use of an sp-protocol does not
impose any constraint on the achievable secret key rate. Moreover,
from this demonstration it is possible to infer that the quality of
the information reconciliation procedure depends only on the
quality of the error correction code. We begin with the proof of
the following lemma (Lemma 1), which allows the random construction
of the punctured and shortened bits in the proposed protocol to be
exploited.

Lemma 1: Let $X$, $Y$ and $Z$ be three random variables. If $Y$ is
independent of $X$ and $Z$, the joint min-entropy of $X$ and $Y$
conditioned on $Z$ can be expressed as:

$$H_\infty(XY|Z) = H_\infty(X|Z) + H_\infty(Y) \tag{15}$$

Proof:

$$H_\infty(XY|Z) = \min_z H_\infty(XY|Z=z) \tag{16}$$

$$= \min_z \left[ -\log \max_{xy} P(xy|z) \right] \tag{17}$$

$$= \min_z \left[ -\log \max_{xy} P(x|z) P(y|z) \right] \tag{18}$$

$$= \min_z \left[ -\log \max_x P(x|z) - \log \max_y P(y|z) \right] \tag{19}$$

$$= H_\infty(X|Z) + H_\infty(Y) \tag{20}$$

where Eq. (18) follows from $X$ and $Y$ being independent
variables, and Eq. (20) from $Y$ and $Z$ being independent
variables.

■

Theorem 2: Given a code $\mathcal{C}(n,k)$, a security constant
$t$, the public communication $C$, and $Z$ the eavesdropper
information, the min-entropy of the variable $\hat{X}$ constructed
by the sp-protocol is, with probability $1 - 2^{-t}$, greater than
or equal to that obtained when using an adapted error correcting
code of rate $R$ to reconcile $X$ and $Y$, minus the security
constant:

$$H_\infty(\hat{X}|ZC) \geq H_\infty(X|Z) - |X|(1-R) - t \tag{21}$$

Proof: Directly from Eq. (14):

$$H_\infty(\hat{X}|ZC) \geq H_\infty(\hat{X}|Z) - |C| - t \tag{22}$$

Distinguishing in $\hat{X}$ the part of the variable that
corresponds to the sequence to be reconciled, $X$, and the
additional variable used to extend the original sequence, $X'$ (see
its correspondence with strings in Fig. 4):

$$= H_\infty(XX'|Z) - |C| - t \tag{23}$$

Since $X'$ is independent of $Z$ and $X$ by construction, Lemma 1
can be applied:

$$= H_\infty(X|Z) + H_\infty(X') - |C| - t \tag{24}$$

The min-entropy $H_\infty(X')$ takes the value of the number of
random bits, $p + s$:

$$= H_\infty(X|Z) + |X| \frac{\pi+\sigma}{1-\pi-\sigma} - |C| - t \tag{25}$$

The length of the conversation $|C|$ is $s + n - k$, which in the
proposed protocol stands for the $s$ shortened bits and the
syndrome of $\hat{X}$. It can be written as a function of the size
of $X$, $\pi$ and $\sigma$:

$$= H_\infty(X|Z) + |X| \left[ \frac{\pi+\sigma}{1-\pi-\sigma} - \frac{1-R_0+\sigma}{1-\pi-\sigma} \right] - t \tag{26}$$

and thus

$$= H_\infty(X|Z) - |X|(1-R) - t \tag{27}$$
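The two ingredients of this argument can be checked numerically.
The sketch below is an illustration rather than part of the proof:
it verifies Eq. (15) on toy distributions chosen arbitrarily, and
then checks the bookkeeping behind Eqs. (25) to (27), namely that
the $p + s$ random bits added minus the $s + n - k$ bits disclosed
equal $-|X|(1-R)$. The numerical parameters are those of Example 1
with $k = 1.2 \times 10^5$ assumed (the rate-0.6 code of Fig. 5).

```python
import math
from itertools import product


def min_entropy(probs):
    """H_inf = -log2 of the largest probability in `probs`."""
    return -math.log2(max(probs))


def conditional_min_entropy(cond):
    """min over z of H_inf(. | Z = z); `cond` maps z to a probability list."""
    return min(min_entropy(dist) for dist in cond.values())


def joint_with_independent(cond_x, p_y):
    """Distribution of (X, Y) given each z when Y is independent of (X, Z)."""
    return {z: [px * py for px, py in product(dist, p_y)]
            for z, dist in cond_x.items()}


def net_bits(n, k, p, s):
    """Random bits added (p + s) minus bits disclosed (s + n - k)."""
    return (p + s) - (s + n - k)


if __name__ == "__main__":
    # Toy distributions, chosen only for illustration.
    cond_x = {"z0": [0.7, 0.3], "z1": [0.6, 0.4]}   # P(x | z)
    p_y = [0.25] * 4                                # two uniform random bits
    lhs = conditional_min_entropy(joint_with_independent(cond_x, p_y))
    rhs = conditional_min_entropy(cond_x) + min_entropy(p_y)
    print(math.isclose(lhs, rhs))                   # Eq. (15)

    # p and s as quoted in Example 1; k = 1.2e5 is an assumption.
    n, k, p, s = 200_000, 120_000, 5_772, 4_228
    rate = (k - s) / (n - p - s)                    # Eq. (8)
    print(math.isclose(net_bits(n, k, p, s), -(n - p - s) * (1 - rate)))
```

The second check holds for any valid $n$, $k$, $p$ and $s$, since
both sides reduce to $k + p - n$.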



V. Numerical Results 

We discuss the efficiency of several protocols in this section. In
order to illustrate the performance of the sp-protocol, in Fig. 5
we compare the results of adapted LDPC codes to regular LDPC codes
without adaptation and to Cascade. We show as well the theoretical
efficiency in the case of infinite length [18]; this curve
indicates the expected asymptotic behaviour of the protocol.

Following Theorem 2, two strings can be reconciled with the
efficiency of a rate adapted code. In the figure, the efficiency of
the punctured and shortened codes is below 1.1 in the whole range
of $p_{err}$, close to the theoretical limit. In comparison, the
codes without adaptation offer a better result close to their
threshold, but the efficiency quickly degrades as the working point
moves away from the threshold. On the other hand, Cascade exhibits
a poorer efficiency over the $p_{err}$ range considered.



[Figure 5 plot: reconciliation efficiency versus Bit Error Rate
(BER). Curves: experimental efficiency for R = {0.65, 0.6, 0.55}
with δ = 0; R = 0.6, δ = 0.05; R = 0.65, δ = 0.05; R = 0.6,
δ = 0.1; Cascade; theoretical efficiency for R = 0.6 with δ = 0.05
and δ = 0.1.]



Fig. 5. Reconciliation efficiency of Cascade [12], LDPC codes
without puncturing and shortening strategies [17], and the
sp-protocol in a practical setting, as defined in Eq. (7). Two LDPC
codes have been chosen to cover the crossover range
$p_{err} \in [0.055, 0.08]$ using the proposed sp-protocol. Both
codes, $\mathcal{C}_1(2 \times 10^5, 1.2 \times 10^5)$ with coding
rate $R = 0.6$ and $\mathcal{C}_2(2 \times 10^5, 1.3 \times 10^5)$
with coding rate $R = 0.65$, cover the range with $\delta = 0.05$,
while $\mathcal{C}_2$ with $\delta = 0.1$ also covers the range of
interest. A third code with rate $R = 0.55$ has been used in order
to compare the efficiency over the studied crossover range with a
direct strategy, i.e. without using puncturing or shortening, as
proposed in [18].



VI. Conclusion 

This paper has discussed the problem of information reconciliation
in the context of secret key agreement. The sp-protocol, a simple
protocol based on puncturing and shortening LDPC codes, has been
proposed. With this protocol the eavesdropper gathers the same
amount of information that an adapted code would reveal, even
though more data are exchanged on the public channel.

It had been argued that information reconciliation based on error
correction codes was not optimal for channels with changing
characteristics [17]: since Alice and Bob have access only to a
discrete set of codes, the efficiency of the reconciliation
exhibits a sawtooth behaviour. The sp-protocol allows Alice and Bob
to reconcile their strings with a continuous efficiency curve and,
as the efficiency of LDPC codes under puncturing and shortening can
be analytically described and optimised, the results proved in this
paper allow the information reconciliation problem to be addressed
as a code design problem. The numerical data in Section V indicate
that efficiency values close to the theoretical limits can be
obtained.